Skip to content

[SPARK-57681][SQL] Support dynamic table options for UPDATE#56792

Open
anuragmantri wants to merge 1 commit into
apache:masterfrom
anuragmantri:delete-update-sql-options
Open

[SPARK-57681][SQL] Support dynamic table options for UPDATE#56792
anuragmantri wants to merge 1 commit into
apache:masterfrom
anuragmantri:delete-update-sql-options

Conversation

@anuragmantri

@anuragmantri anuragmantri commented Jun 25, 2026

Copy link
Copy Markdown

What changes were proposed in this pull request?

This PR add the dynamic table options syntax to UPDATE statements.

Why are the changes needed?

This is a continuation to adding options syntax in DMLs SPARK-49098

Does this PR introduce any user-facing change?

Yes, it adds new SQL syntax. For example:

 UPDATE catalog.db.table WITH (`k` = 'v' )
 SET status = 'archived' WHERE created_at < '2024-01-01'.

How was this patch tested?

Added tests to the following suites.

  • Parser (DDLParserSuite)
  • End-to-end (UpdateTableSuiteBase)
  • LogicalWriteInfo (InMemoryRowLevelOperationTable)

Was this patch authored or co-authored using generative AI tooling?

I used Claude Code (Claude Opus 4.8) to generate the code and tests and verified manually.

@anuragmantri

Copy link
Copy Markdown
Author

@szehon-ho, tagging you since you made the INSERT counterpart :)

// don't rewrite as the table supports truncation
d

case r @ ExtractV2Table(t: SupportsRowLevelOperations) =>

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please consider this concern: silent option-drop on non-rewrite delete paths: for a table implementing both SupportsRowLevelOperations and SupportsDeleteV2 (e.g. Iceberg), a DELETE … WITH(...) WHERE can take the metadata-only/deleteWhere path (OptimizeMetadataOnlyDeleteFromTable / DataSourceV2Strategy), which has no options parameter; so the user's options are silently ignored. Same for the TruncatableTable truncate path. This would be a new user-visible "silently ignored" surprise, and no test currently covers this path.

@anuragmantri anuragmantri Jun 27, 2026

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch. I looked at those paths. The options are indeed not being passed all the way. Fixing this requires DSv2 API changes something like this with default methods.

  // SupportsDeleteV2
  default void deleteWhere(Predicate[] predicates, CaseInsensitiveStringMap options) {
    deleteWhere(predicates);   // default ignores options → fully back-compatible
  }
  // TruncatableTable
  default boolean truncateTable(CaseInsensitiveStringMap options) {
    return truncateTable();
  }

I can make this change in this same PR or can do a follow up if DSv2 changes need separate PRs. Let me know what you think is the best way forward.

| UPDATE identifierReference tableAlias setClause whereClause? #updateTable
| DELETE FROM identifierReference optionsClause? tableAlias whereClause? #deleteFromTable
| UPDATE identifierReference optionsClause? tableAlias setClause whereClause? #updateTable
| MERGE (WITH SCHEMA EVOLUTION)? INTO target=identifierReference targetAlias=tableAlias

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this PR makes INSERT/DELETE/UPDATE accept WITH(...), but how about MERGE?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I intentionally did not add MERGE as the semantics are a bit different since MERGE statements have source and target relations, I'm still thinking about how options would look like there. I can file a separate JIRA and PR for MERGE later.

stop = 56))
}

test("delete from table: with options") {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make sure to either update the SQL reference docs, or file a followup Jira ticket to do so later.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sure. I took a look and looks like INSERT docs are also not updated. I will create a follow up JIRA to update all options related DMLS at once.

@anuragmantri

Copy link
Copy Markdown
Author

@dongjoon-hyun - Could you please review this PR since you had reviewed its precedent (SPARK-49098) as well? Thank you!

@dongjoon-hyun dongjoon-hyun left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pinging me, @anuragmantri .

In general, given the code contribution, the claim of this PR title is too broad. When you make a PR, please revise the PR title by narrowing down as much as possible to the exact contribution scope.

We don't need to add MERGE here. Instead, I'd like to recommend to proceed DELETE FROM syntax in another JIRA issue after removing this PR because it could be misleading if we ignore options. For UPDATE syntax, IIUC, there is no silent ignoring code path.

Lastly, follow-up have a special meaning in the community. So, please don't say follow-up of something especially when the Fix Version is different.

If you narrow down the scope according to the above, I believe we can get a higher chance of merge to Apache Spark 4.3.0, @anuragmantri .

@anuragmantri anuragmantri force-pushed the delete-update-sql-options branch from 398cadc to 50072c3 Compare July 1, 2026 20:30
@anuragmantri anuragmantri changed the title [SPARK-57681][SQL] Support dynamic table options for DELETE and UPDATE [SPARK-57681][SQL] Support dynamic table options UPDATE Jul 1, 2026
@anuragmantri anuragmantri changed the title [SPARK-57681][SQL] Support dynamic table options UPDATE [SPARK-57681][SQL] Support dynamic table options UPDATE Jul 1, 2026
@anuragmantri anuragmantri changed the title [SPARK-57681][SQL] Support dynamic table options UPDATE [SPARK-57681][SQL] Support dynamic table options for UPDATE Jul 1, 2026
@anuragmantri

Copy link
Copy Markdown
Author

Thanks for the valuable feedback @dongjoon-hyun. As you suggested, I have narrowed the scope of this JIRA and the PR to only UPDATE as it is the most unambiguous. I will work on the other DMLs and the docs on separate JIRAs.

This is ready for another round. Thanks!

@dongjoon-hyun dongjoon-hyun left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens for the following queries?

UPDATE t WITH (...) ...
WHERE id IN (SELECT ... FROM t)

We don't consider these new options for analyzer relation caches, do we?

case CatalogAndIdentifier(catalog, ident) =>
val key = toCacheKey(catalog, ident, finalTimeTravelSpec)
val planId = u.getTagValue(LogicalPlan.PLAN_ID_TAG)
relationCache
.get(key)

Given that, it seems that there are more corner cases with analyzer relation cache. Could you add more test coverage and verity them? Typically, Subquery and CTE are the corner cases we need to consider carefully, @anuragmantri .

| fromClause multiInsertQueryBody+ #multiInsertQuery
| DELETE FROM identifierReference tableAlias whereClause? #deleteFromTable
| UPDATE identifierReference tableAlias setClause whereClause? #updateTable
| UPDATE identifierReference optionsClause? tableAlias setClause whereClause? #updateTable

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although this seems to follow read path, this is inconsistent with our existing INSERT syntax.

| INSERT (WITH SCHEMA EVOLUTION)? INTO TABLE? identifierReference tableAlias optionsClause? (BY NAME)?
REPLACE (WHERE | ON) replaceCondition=booleanExpression #insertIntoReplaceBooleanCond
| INSERT (WITH SCHEMA EVOLUTION)? INTO TABLE? identifierReference tableAlias optionsClause? (BY NAME)?
REPLACE USING identifierList #insertIntoReplaceUsing

UPDATE should follow INSERT style.

stop = 70))
}

test("update table: with options") {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use test prefix when it's possible.

- test("update table: with options") {
+ test("SPARK-57681: update table: with options") {

@dongjoon-hyun

dongjoon-hyun commented Jul 1, 2026

Copy link
Copy Markdown
Member

I finished the second-round review, @anuragmantri .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants